rmacroRDM

adventures in research data management and open source software.


Anna Krystalli | @annakrystalli

Sheffield R users group - 21st June 2016

Outline

  • Present the functionality of the under-development package

The functions in the package enable easier handling and compiling of biological trait data.


  • Discuss experience so far of developing an open project

Background

Code developed while working on my last two projects:

Sex Roles in Birds
Bird trait networks


  • tracking data provenance and specificity is key
  • code & data are major outputs

Background

The basic premise has been pretty consistent:

Here's a master data sheet, here's a bunch more data in various formats and with varied reference information, here's some more open sources of data; put it all together and prepare it for analysis.


2 iterations of development

Mozilla Working Open Workshop

Feb 2016 - Berlin


2 day workshop on setting up and running an open project.

Mozilla Working Open Workshop

Aim: run a project openly at the Global Sprint

Run up to Global Sprint

  • biweekly mentoring
  • prepared a GitHub folder and registered it with the sprint repo

lesson: documenting your code makes better code

Repo structure

Global Sprint

Outcomes: repo

Outcomes: feedback

Outcomes: context

- BAAD

a Biomass And Allometry Database for woody plants

- traits

R client for various sources of species trait data.

- BETYdb

The Biofuel Ecophysiological Traits and Yields Database (BETYdb), a database of plant trait and yield data

Outcomes: coral traits

Why working open

  • opening up makes better code
  • being visible helps link up with relevant projects
  • being visible increases feedback
  • building stronger networks and communities


= science wins!

In that spirit

Get in touch